14 research outputs found

    Cooperative partitioning: Energy-efficient cache partitioning for high-performance CMPs


    A Common Left Occipito-Temporal Dysfunction in Developmental Dyslexia and Acquired Letter-By-Letter Reading?

    We used fMRI to examine functional brain abnormalities of German-speaking dyslexics who suffer from slow, effortful reading but not from a reading accuracy problem. Similar to acquired cases of letter-by-letter reading, the developmental cases exhibited an abnormally strong effect of length (i.e., number of letters) on response time for words and pseudowords. Corresponding to lesions of left occipito-temporal (OT) regions in acquired cases, we found a dysfunction of this region in our developmental cases, who failed to exhibit responsiveness of left OT regions to the length of words and pseudowords. This abnormality in the left OT cortex was accompanied by absent responsiveness to increased sublexical reading demands in phonological inferior frontal gyrus (IFG) regions. Interestingly, there was no abnormality in the left superior temporal cortex, which, corresponding to the phonological deficit explanation, is considered to be the prime locus of the reading difficulties of developmental dyslexia cases. The present functional imaging results suggest that developmental dyslexia, similar to acquired letter-by-letter reading, is due to a primary dysfunction of left OT regions.

    Vectorization-aware loop unrolling with seed forwarding

    Loop unrolling is a widely adopted loop transformation, commonly used for enabling subsequent optimizations. Straight-line-code vectorization (SLP) is an optimization that benefits from unrolling. SLP converts isomorphic instruction sequences into vector code. Since unrolling generates repeated isomorphic instruction sequences, it enables SLP to vectorize more code. However, most production compilers apply these optimizations independently and without coordination. Unrolling is commonly tuned to avoid code bloat, not to maximize the potential for vectorization, leading to missed vectorization opportunities. We propose VALU, a novel loop unrolling heuristic that takes vectorization into account when making unrolling decisions. Our heuristic is powered by an analysis that estimates the potential benefit of SLP vectorization for the unrolled version of the loop. Our heuristic then selects the unrolling factor that maximizes the utilization of the vector units. VALU also forwards the vectorizable code to SLP, allowing it to bypass its greedy search for vectorizable seed instructions, exposing more vectorization opportunities. Our evaluation on a production compiler shows that VALU uncovers many vectorization opportunities that were missed by the default loop unroller and vectorizers. This results in more vectorized code and significant performance speedups for 17 of the kernels of the TSVC benchmark suite, reaching up to 2× speedup over the already highly optimized -O3. Our evaluation on full benchmarks from FreeBench and MiBench shows that VALU results in a geo-mean speedup of 1.06×.
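
The idea of choosing an unrolling factor that fills the vector units can be illustrated with a toy model. The sketch below is not VALU's analysis, which the abstract does not detail; it assumes every scalar operation in the loop body is isomorphic and independent across iterations, so unrolling by `unroll` yields `scalar_ops * unroll` operations that can be packed into vectors of `lanes` lanes. The function names, the candidate factors, and the tie-breaking rule (prefer the smallest factor, to limit code growth) are all hypothetical.

```python
from math import ceil


def lane_utilization(scalar_ops, unroll, lanes):
    """Fraction of vector lanes filled after unrolling (toy model).

    Assumes all `scalar_ops * unroll` operations are isomorphic and
    independent, so they can be packed `lanes` at a time.
    """
    total = scalar_ops * unroll
    vectors = ceil(total / lanes)  # vector instructions needed
    return total / (vectors * lanes)


def pick_unroll_factor(scalar_ops, lanes, candidates=(1, 2, 4, 8)):
    """Pick the candidate factor with the highest lane utilization.

    Ties go to the smallest factor (hypothetical rule to limit code bloat).
    """
    return max(
        candidates,
        key=lambda u: (lane_utilization(scalar_ops, u, lanes), -u),
    )
```

For example, with one scalar operation per iteration and 4-lane vectors, factors 1 and 2 leave lanes empty, while factor 4 is the smallest candidate that fills a whole vector, so `pick_unroll_factor(1, 4)` selects it.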